Multi-Task WaveRNN With an Integrated Architecture for Cross-Lingual Voice Conversion
Authors
Abstract
Similar resources
Cross-Lingual Voice Conversion
Cross-lingual voice conversion refers to the automatic transformation of a source speaker’s voice into a target speaker’s voice in a language that the target speaker cannot speak. It involves a set of statistical analysis, pattern recognition, machine learning, and signal processing techniques. This study focuses on the problems related to cross-lingual voice conve...
Frame alignment method for cross-lingual voice conversion
Most existing voice conversion methods calculate the optimal transformation function from a given set of paired acoustic vectors of the source and target speakers. Aligning the phonetically equivalent source and target frames is problematic when the available training corpus is not parallel, even though this is the most realistic situation. The alignment task is even more difficult ...
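For context, the standard baseline for the frame-pairing step described above, when parallel utterances are available, is dynamic time warping (DTW); the non-parallel case this abstract addresses is precisely where that baseline breaks down. A minimal NumPy sketch (feature dimensions and data are made up for illustration):

```python
import numpy as np

def dtw_align(src, tgt):
    """Align two sequences of acoustic frames (e.g. MFCCs) with
    dynamic time warping; returns a list of (src_idx, tgt_idx) pairs."""
    n, m = len(src), len(tgt)
    # Pairwise Euclidean distances between all source/target frames.
    dist = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=-1)
    # Accumulated-cost matrix with the usual three-way recursion.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

src = np.random.rand(20, 13)   # 20 source frames, 13-dim features
tgt = np.random.rand(25, 13)   # 25 target frames
pairs = dtw_align(src, tgt)
print(pairs[0], pairs[-1])     # path runs from (0, 0) to (19, 24)
```

With a non-parallel corpus the two sequences are not renditions of the same utterance, so this distance-based path is no longer meaningful, which motivates the alignment methods the paper proposes.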
Spectral Mapping Using Artificial Neural Networks for Intra-lingual and Cross-lingual Voice Conversion
Cross-lingual adaptation with multi-task adaptive networks
Posterior-based or bottleneck features derived from neural networks trained on out-of-domain data may be successfully applied to improve speech recognition performance when data is scarce for the target domain or language. In this paper we combine this approach with a hierarchical deep neural network (DNN) structure – which we term a multi-level adaptive network (MLAN) – and ...
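The bottleneck features mentioned above are simply the activations of a narrow hidden layer, taken after discarding the classifier head. A toy NumPy sketch with random stand-in weights (in practice the network is trained on the out-of-domain data first; all layer sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy DNN: 40-dim input -> 256 -> 32 (bottleneck) -> 256 -> 10 classes.
# Random weights stand in for a network trained on out-of-domain data.
W1 = rng.normal(size=(40, 256))
W2 = rng.normal(size=(256, 32))
W3 = rng.normal(size=(32, 256))
W4 = rng.normal(size=(256, 10))

def bottleneck_features(frames):
    """Run frames through the lower layers only; the narrow hidden
    layer's activations serve as a compact, transferable representation."""
    h1 = relu(frames @ W1)
    return relu(h1 @ W2)          # (T, 32) bottleneck activations

frames = rng.normal(size=(100, 40))   # 100 acoustic frames
feats = bottleneck_features(frames)
print(feats.shape)                    # (100, 32)
```

These low-dimensional features can then feed a second network trained on the scarce in-domain data, which is the layering idea the MLAN structure builds on.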
Multi-Task Cross-Lingual Sequence Tagging from Scratch
We present a deep hierarchical recurrent neural network for sequence tagging. Given a sequence of words, our model employs deep gated recurrent units on both character and word levels to encode morphology and context information, and applies a conditional random field layer to predict the tags. Our model is task independent, language independent, and feature engineering free. We further extend ...
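The conditional random field layer mentioned above is typically decoded with the Viterbi algorithm over per-token emission scores and a tag-transition matrix. A minimal NumPy sketch with made-up scores (values are illustrative, not from the paper):

```python
import numpy as np

def viterbi(emissions, transitions):
    """Most likely tag sequence given per-token emission scores (T x K)
    and tag-to-tag transition scores (K x K)."""
    T, K = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        # total[i, j] = best path ending in tag i at t-1, then tag j at t
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Follow backpointers from the best final tag.
    tags = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        tags.append(int(back[t, tags[-1]]))
    return tags[::-1]

# Three tokens, two tags; scores chosen by hand for illustration.
emissions = np.array([[2.0, 0.0], [0.0, 1.0], [1.5, 0.2]])
transitions = np.array([[0.5, -1.0], [-1.0, 0.5]])
path = viterbi(emissions, transitions)
print(path)  # [0, 0, 0]
```

The transition matrix lets the decoder trade a locally better tag for a globally better sequence: here the strong self-transition score keeps tag 0 at the second token despite its higher emission for tag 1.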
Journal
Journal title: IEEE Signal Processing Letters
Year: 2020
ISSN: 1070-9908, 1558-2361
DOI: 10.1109/lsp.2020.3010163